To extract reliable cosmic parameters from cosmic microwave background datasets, it is essential to 
show that the data are not contaminated by residual non-cosmological signals. We describe general 
statistical approaches to this problem, with an emphasis on the case in which there are two datasets 
that can be checked for consistency. A first visual step is the Wiener filter mapping from one set 
of data onto the pixel basis of another. For more quantitative analyses we develop and apply both 
Bayesian and frequentist techniques. We define the "contamination parameter" and advocate the 
calculation of its probability distribution as a means of examining the consistency of two datasets. 
The closely related "probability enhancement factor" is shown to be a useful statistic for comparison; 
it is significantly better than a number of other quantities we consider. Our methods can be used 
internally (between different subsets of a dataset) or externally (between different experiments); for 
observing regions that completely overlap, partially overlap or do not overlap at all; and for observing 
strategies that differ greatly. 

We apply the methods to check the consistency (internal and external) of the MSAM92, MSAM94 
and Saskatoon Ring datasets. From comparing the two MSAM datasets, we find that the most 
probable level of contamination is 12%, with no contamination only 1.05 times less probable, 50% 
contamination about 8 times less probable and 100% contamination strongly ruled out at over $2 \times 10^5$ 
times less probable. From comparing the 1992 MSAM flight with the Saskatoon data we find the 
most probable level of contamination to be 50%, with no contamination only 1.6 times less probable 
and 100% contamination 13 times less probable. Our methods can also be used to calibrate one 
experiment off of another. To achieve the best agreement between the Saskatoon and MSAM data 
we find that the MSAM data should be multiplied by (or Saskatoon data divided by): l.OGlg ^g. 



I. INTRODUCTION 

The cosmic microwave background (CMB) is black 
body radiation with a mean temperature of 2.728 ± 0.002 
K [ ]. This mean is modulated by a dipole due to our 
peculiar motion with respect to the radiation field. If one 
removes the dipole, the temperature is uniform in every 
direction to ±100 μK. Precision measurement of these 
tiny deviations from isotropy can tell us much about the 
Universe. 

Unfortunately, precision measurement of 100 μK 
fluctuations is not an easy task. Even given sufficient 
detector sensitivity and observing time, one still has to 
contend with many possible contaminants such as side-lobe 
pickup of the 300 K Earth and atmospheric noise 
(even from high-altitude balloons). In addition there can 
be contamination of CMB observations by astrophysical 
foregrounds. 

Despite these difficulties there is good reason to be- 
lieve that, at least for some experiments, the signals 
observed from sub-orbital platforms are not dominated 
by contaminants. One of the best reasons for believing 
this comes from the comparisons that have been done: 
between FIRS and DMR [ ], Tenerife and DMR [ ], MSAM 
and Saskatoon, two years of Python data [ ], and two 
flights of MSAM. Especially when the data being compared 
are from two different instruments, almost the only thing 
their acquisitions have in common is that they were 
observing the same piece of sky; each dataset has entirely 
different sources of systematic error. 

In addition to confirming the astrophysical origin of 
the estimated signal, comparison can greatly improve the 
ability to detect foreground contamination. Perhaps the 
best evidence for the thermal nature of the anisotropy comes 
from the comparison between the MSAM92 and Saskatoon 
datasets. Together, these observations span a frequency 
range from 36 GHz to greater than 170 GHz. In [ ] 
it was found that the spectral index $\beta$ ($\delta T \propto (\nu/\nu_0)^\beta$) 
is constrained to be $\beta = -0.1 \pm 0.2$. For CMB, free-free 
and dust over this frequency range we expect $\beta = 0$, 
$-1.45$ and $2.25$, respectively. The authors conclude that 
the signals (in the region of overlap) are not dominated 
by contamination from known astrophysical foregrounds, 
but are, rather, primarily CMB. 

We should not let this apparent success fool us into 
thinking that going to the next level of precision will be 
easy. There is a big difference in the level of toleration 
of contaminants when the goal switches from detection 
to precision measurement. It is likely that there will be 
significant levels of contamination (from the atmosphere, 
side lobes, and foregrounds) in future sub-orbital mis- 
sions. It may be difficult to convincingly demonstrate 
that contamination is low without comparison. 
Given the importance of comparison, we feel it is worth 
improving upon the methods used previously. Past treat- 
ments have had to ignore much relevant data, and make 
uncontrolled approximations. This is due to the fact that 
generally the two datasets being compared were obtained 
from instruments observing the sky in different ways. 
The beam patterns and differencing schemes may differ 
as in the case of the MSAM/Saskatoon comparison. In 
[ ] one of the MSAM differencing schemes was approximately 
recreated in software in order to do the comparison. 
However, no use of software could change the fact 
that the MSAM and Saskatoon beam patterns, although 
they have fairly similar full-widths at half-maximum, 
differ significantly in shape. Even when the differencing 
schemes and beam patterns are the same, there can still 
be barriers to a direct comparison. The two MSAM 
flights took data with essentially the same beam pattern 
and applied the same differencing, but in this case the 
direct comparison is frustrated by the fact that the pixels 
do not all line up exactly. Therefore in [ ], pixels within 
half a beam width of each other were approximated as 
being at the same point, and those pixels with no partner 
from the other dataset within this distance were ignored. 
Half of the data were lost this way. 

Here we develop methods of comparing datasets that 
do not require any information to be thrown away. Dif- 
ferences in demodulation schemes, and effects due to non- 
overlapping pixels are automatically taken into account. 
The inevitable price we pay for this is model-dependence. 
However, we generally expect the model-dependence to 
be small and indeed find it to be so in the case studies 
shown here. 

An extremely useful tool for visual comparison is the 
Wiener filter. Roughly speaking, it allows us to interpo- 
late the results from one experiment onto the expected 
results for another experiment that has observed the sky 
differently. After some notational preliminaries in section 
II, in section III we introduce the Wiener filter in the context 
of the probability distribution of the signal, given the 
data. Also in this section we describe the datasets and 
apply the Wiener filter to them. 

When comparing datasets we are testing the consis- 
tency of our model of the datasets. We emphasize that 
meaningful model consistency testing demands the exis- 
tence of other models with which to compare. Therefore 
we extend our model of the data to include a possible con- 
taminant and calculate the probability distribution of its 
amplitude, given the data. We find a more limited exten- 
sion of the model space to also be useful, in which we only 
consider one alternative to no contamination: complete 
contamination. We define the "probability enhancement 
factor" as the logarithm of the ratio of the probability 
of no contamination to the probability of complete con- 
tamination. This Bayesian approach to comparison is 
described and applied in section IV. 

In section V we discuss and apply frequentist techniques 
such as $\chi^2$ tests. The probability enhancement 
factor can also be used as the basis for a frequentist test, 
and it is in fact the well-known likelihood ratio test. We 
demonstrate that the probability enhancement factor has 
more discriminatory power than any of the other tests 
considered. 

After a further look at the data with the probability 
enhancement factor in section VI, we discuss the fixing of 
relative calibration in section VII and possible contamination 
due to dust in section VIII. Finally we summarize 
our results in section IX. 



II. PRELIMINARIES 

Before moving on to a discussion of the various statis- 
tics to be used in comparing datasets, we give some re- 
view which will serve to define our notation, following 
Ref. 1. 

In general, CMB observations are reduced to a set of 
binned observations of the sky, or pixels, $A_i$, $i = 1 \ldots N$, 
together with a noise covariance matrix, $C_{Nii'}$. We model 
the observations as contributions from signal and noise, 

$$A_i = s_i + n_i. \qquad (2.1)$$

We assume that the signal and noise are independent 
with zero mean, with correlation matrices given by 

$$C_{Tii'} = \langle s_i s_{i'} \rangle; \quad C_{Nii'} = \langle n_i n_{i'} \rangle \qquad (2.2)$$

so 

$$\langle A_i A_{i'} \rangle = C_{Tii'} + C_{Nii'} \qquad (2.3)$$

where $\langle \ldots \rangle$ indicates an ensemble average. With the 
further assumption that the data are Gaussian, these 
two-point functions are all that is necessary for a complete 
statistical description of the data. 

One important complication to the above description 
comes from the existence of constraints. Often the data, 
$A_i$, are susceptible to some large source of noise, or a 
not-well-understood source of noise that contaminates 
only one mode of the data. For example, there may 
be an unknown offset in the data. In this case, the 
average is usually subtracted from $A_i$. Similarly, the 
monopole and dipole are explicitly subtracted from the 
all-sky COBE/DMR data, because the monopole is not 
determined by the data and the dipole is local in origin. 
In general, placing any constraint on the data or some 
subset thereof, such as insisting that its average be zero, 
results in additional correlations in $A_i$. We take this into 
account by adding these additional correlations, $C_C$, to 
the noise matrix to create a "generalized noise matrix," 
equal to $C_N + C_C$. In the limit that the amplitude 
of $C_C$ gets large, this is equivalent to projecting out 
those modes which are now unconstrained by the data, 
but we find this scheme simpler to implement numerically. 
Thus in the text below we always write the noise 
matrix as $C_N$, with the understanding that it denotes the 
generalized noise matrix. The details of this procedure 
for handling the effect of constraints are explained 
in [ ]. 
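The equivalence between a large-amplitude constraint matrix and projecting out the constrained mode can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the 4-pixel toy noise matrix and the amplitude of $10^6$ are hypothetical choices, not values from the analysis. It builds $C_N + C_C$ with $C_C$ proportional to the outer product of an offset mode and shows that the resulting $\chi^2$ is insensitive to an arbitrary offset added to the data.

```python
import numpy as np

def generalized_noise(c_noise, modes, amplitude=1e6):
    """Return C_N + C_C, with C_C = amplitude * sum_k u_k u_k^T.

    `modes` holds one constrained mode u_k per row (hypothetical helper;
    the amplitude only needs to be large compared with the noise).
    """
    modes = np.atleast_2d(np.asarray(modes, dtype=float))
    return c_noise + amplitude * modes.T @ modes

def chi_squared(data, cov):
    """data^T cov^{-1} data, using a linear solve instead of an inverse."""
    return data @ np.linalg.solve(cov, data)

# Toy example: 4 pixels with white noise and an unknown common offset.
c_noise = 0.5 * np.eye(4)
offset_mode = np.ones(4)
cov = generalized_noise(c_noise, offset_mode)

data = np.array([0.3, -1.2, 0.8, 0.1])
print(chi_squared(data, cov))        # chi^2 of the original data
print(chi_squared(data + 5.0, cov))  # nearly identical: the offset mode carries no weight
```

Because the offset mode is given an effectively infinite variance, any statistic built from the generalized matrix ignores that mode, which is what an explicit projection would achieve.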
Due to finite angular resolution and switching strategies 
designed to minimize contributions from spurious 
signals (such as from the atmosphere), the signal is generally 
not simply the temperature of the sky in some direction, 
$T(x)$, but a linear combination of temperatures: 

$$s_i = \int d^2x \, H_i(x) T(x) \qquad (2.4)$$

where $H_i(x)$ is sometimes called the "beam map", "antenna 
pattern" or "synthesis vector". If we discretize the 
temperature on the sky then we can write the beam map 
in matrix form, $s_i = \sum_n H_{in} T_n$. 

The temperature on the sky, like any scalar field on a 
sphere, can be decomposed into spherical harmonics 

$$T(x) = \sum_{\ell m} a_{\ell m} Y_{\ell m}(x). \qquad (2.5)$$

If the anisotropy is statistically isotropic, i.e., there are 
no special directions in the mean, then the variance of 
the multipole moments, $a_{\ell m}$, is independent of $m$ and we 
can write: 

$$\langle a_{\ell m} a^*_{\ell' m'} \rangle = C_\ell \, \delta_{\ell\ell'} \delta_{mm'}. \qquad (2.6)$$

For theories with statistically isotropic Gaussian initial 
conditions, the angular power spectrum, $C_\ell$, is the entire 
statistical content of the theory in the sense that any 
possible predictions of the theory for the temperature of 
the microwave sky can be derived from it*. 

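The theoretical covariance matrix, $C_{Tii'}$, is related to 
the angular power spectrum by 

$$C_{Tii'} = \sum_\ell \frac{2\ell + 1}{4\pi} C_\ell W_{\ell ii'} \qquad (2.7)$$

where 

$$W_{\ell ii'} = \sum_{nn'} H_{in} H_{i'n'} P_\ell(\cos\theta_{nn'}) \qquad (2.8)$$

is called the window function of the observations and 
$\theta_{nn'}$ is the angular separation between the points on the 
sphere labeled by $n$ and $n'$. 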
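As a rough illustration of how the theoretical covariance follows from a power spectrum and a set of discretized beam maps, the sketch below sums the Legendre series for the sky covariance $\langle T_n T_{n'} \rangle = \sum_\ell \frac{2\ell+1}{4\pi} C_\ell P_\ell(\cos\theta_{nn'})$ and then applies the beam maps on both sides, which is algebraically the same as contracting $C_\ell$ with the window function. The function name, the three-point toy sky and the toy spectrum are assumptions made for the example only.

```python
import numpy as np
from numpy.polynomial import legendre

def theory_covariance(c_ell, beam_maps, cos_theta):
    """C_T[i, i'] = sum_l (2l+1)/(4 pi) C_l W_l[i, i'].

    c_ell     : power spectrum values for l = 0, 1, 2, ...
    beam_maps : array H[i, n] of discretized beam maps
    cos_theta : cosines of the angular separations between sky points n, n'
    """
    c_ell = np.asarray(c_ell, dtype=float)
    ells = np.arange(len(c_ell))
    # Coefficients of the Legendre series sum_l coeff[l] * P_l(x).
    coeff = (2 * ells + 1) / (4 * np.pi) * c_ell
    sky_cov = legendre.legval(cos_theta, coeff)   # <T_n T_n'>
    return beam_maps @ sky_cov @ beam_maps.T

# Toy sky: three points 0.5 degrees apart along a great circle.
angles = np.radians([0.0, 0.5, 1.0])
n_hat = np.stack([np.cos(angles), np.sin(angles), np.zeros(3)], axis=1)
cos_theta = np.clip(n_hat @ n_hat.T, -1.0, 1.0)

# Two single-difference "pixels" built from neighbouring sky points.
H = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])

# Toy flat spectrum, l(l+1) C_l = constant, for 2 <= l < 300.
ells = np.arange(300)
c_ell = np.zeros(300)
c_ell[2:] = 1.0 / (ells[2:] * (ells[2:] + 1.0))

print(theory_covariance(c_ell, H, cos_theta))
```

A differencing beam map whose weights sum to zero has no response to the monopole, so setting only $C_0 \neq 0$ in this sketch returns a covariance of zero.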

Within the context of a model, the $C_\ell$ depend on some 
parameters, $a_p$, $p = 1 \ldots N_p$, which could be the Hubble 
constant, baryon density, redshift of reionization, 
etc. The theoretical covariance matrix will depend on 
these parameters through its dependence on $C_\ell$. We can 
now write down the probability distribution for the data, 
given the model parameters, $a_p$: 

$$P(A | C_T(a_p) I) = \frac{1}{(2\pi)^{N/2} |C_T(a_p) + C_N|^{1/2}} \exp\left[ -\frac{1}{2} A^T \left( C_T(a_p) + C_N \right)^{-1} A \right]. \qquad (2.9)$$

The $I$ here stands generically for information: in 
this case the information that the noise is 
Gaussian-distributed with zero mean and variance $C_N$. 

* Non-linear evolution will produce non-Gaussianity from 
Gaussian initial conditions but this is quite sub-dominant for 
$\ell < 1000$. 
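The Gaussian probability of Eq. (2.9) is usually evaluated in logarithmic form. The following is a minimal sketch, assuming dense covariance matrices small enough to factor directly; the function name is hypothetical, and the Cholesky factorization is simply one common way to obtain the determinant and the quadratic form together.

```python
import numpy as np

def gaussian_log_likelihood(data, c_theory, c_noise):
    """ln P(A | C_T, I) for zero-mean Gaussian data with covariance C_T + C_N."""
    cov = c_theory + c_noise
    chol = np.linalg.cholesky(cov)             # cov = L L^T
    whitened = np.linalg.solve(chol, data)     # L^{-1} A
    chi2 = whitened @ whitened                 # A^T cov^{-1} A
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    n = data.size
    return -0.5 * (chi2 + log_det + n * np.log(2.0 * np.pi))

# One-pixel check: the total variance is C_T + C_N = 2.
print(gaussian_log_likelihood(np.array([1.0]),
                              np.array([[1.0]]),
                              np.array([[1.0]])))
```

Scanning this function over the parameters that enter $C_T$ gives the likelihood analysis referred to in the text.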



III. WIENER FILTERS 



A. Derivation 



Bayes' theorem 

$$P(s | A I) = P(s | I) P(A | s I) / P(A | I) \qquad (3.1)$$

follows from elementary rules of probability. If we take 
$P(s|I)$ to be a Gaussian distribution with zero mean and 
covariance $C_T$ and $P(A|sI)$ to be a Gaussian with mean 
$s$ and variance $C_N$ then with a little algebra it follows 
that the probability distribution for the signal, given the 
data, $C_T$ and $C_N$, is: 

$$P(s | A, C_T, C_N) = \frac{\exp\left[ -\frac{1}{2} (s - wA)^T M^{-1} (s - wA) \right]}{(2\pi)^{N/2} |M|^{1/2}} \qquad (3.2)$$

where $M = \langle (s - wA)(s - wA)^T \rangle = C_T - w C_T$ and 

$$w = C_T (C_T + C_N)^{-1} \qquad (3.3)$$

is the Wiener filter [ ]. As one can immediately see 
from Eq. (3.2), the most probable value of the signal 
is given by $wA$. As with all Gaussian distributions, 
this most probable value is also the mean: 
$\bar{s} = \int s \, P(s | A, C_T, C_N) \, ds = wA$. 

Thus the Wiener filter operating on the data provides 
us with the most probable estimate of the underlying 
signal. Of course, this is the most probable signal only 
once we assume a power spectrum, $C_\ell$, which is used to 
calculate $C_T$. Fortunately this model dependence is quite 
weak: the Wiener filter provides a robust estimate of the 
underlying signal provided theories are not chosen which 
are clearly incompatible with the data. 

The Wiener filter can be very helpful for visualizing 
the underlying signal. For example, often the data are 
oversampled; that is, there are closely spaced data points 
with plenty of scatter and large error bars. In a sense, 
the Wiener filter knows that the high spatial frequency 
scatter is due to noise and not signal and performs a 
smoothing of the data — an interpolation controlled by 
the different statistical properties of the noise and signal. 
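A minimal numerical sketch of the Wiener filter of Eq. (3.3) and the posterior covariance of Eq. (3.2) follows. The function name and the one-pixel example are illustrative assumptions; the linear solve is used in place of an explicit matrix inverse purely for numerical convenience.

```python
import numpy as np

def wiener_filter(data, c_theory, c_noise):
    """Return the posterior mean w A and covariance M = C_T - w C_T.

    w = C_T (C_T + C_N)^{-1} is obtained from the transposed system
    (C_T + C_N)^T w^T = C_T^T, which avoids forming the inverse.
    """
    w = np.linalg.solve((c_theory + c_noise).T, c_theory.T).T
    mean = w @ data
    cov = c_theory - w @ c_theory
    return mean, cov

# One pixel with signal variance 4 and noise variance 1:
# the filter weight is 4 / (4 + 1) = 0.8.
mean, cov = wiener_filter(np.array([10.0]),
                          np.array([[4.0]]),
                          np.array([[1.0]]))
print(mean, cov)
```

The square roots of the diagonal of the returned covariance give the one-standard-deviation band around the filtered signal.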
One can also use the dataset to calculate the most 
probable signal in some other dataset†; let's call the two 
datasets $A_1$ and $A_2$, where the subscripts refer here to 
the entire appropriate data vector, not the single element 
at a particular pixel. Before getting to $P(s_2|A_1)$, we 
describe some notation for joint datasets. We represent 
the total data vector as 

$$A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}. \qquad (3.4)$$

This vector will have a total covariance matrix 

$$\langle A A^T \rangle = \begin{pmatrix} \langle A_1 A_1^T \rangle & \langle A_1 A_2^T \rangle \\ \langle A_2 A_1^T \rangle & \langle A_2 A_2^T \rangle \end{pmatrix} = \begin{pmatrix} C_{T11} + C_{N11} & C_{T12} \\ C_{T21} & C_{T22} + C_{N22} \end{pmatrix} \qquad (3.5)$$

where $C_{Tij}$ represents the theoretical covariance between 
the pixels of experiments $i$ and $j$, and $C_{Tij} = C_{Tji}^T$. We 
will also define $C_{ij} = C_{Tij} + C_{Nij}$. We assume that 
the experiments have no common noise sources and thus 
$C_{N12} = 0$. 

With this notation established we can now write 

$$P(s_2 | A_1, C_T, C_N) = \frac{\exp\left[ -\frac{1}{2} (s_2 - w_{21} A_1)^T M^{-1} (s_2 - w_{21} A_1) \right]}{\left[ (2\pi)^N |M| \right]^{1/2}} \qquad (3.6)$$

where $M = C_{T22} - w_{21} C_{T12}$, 

$$w_{21} = C_{T21} (C_{T11} + C_{N11})^{-1} \qquad (3.7)$$

and we refer to the Wiener-filtering of dataset one "onto" 
dataset two. 
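Filtering one dataset onto the pixels of another uses the same algebra with the cross-covariance in place of the auto-covariance. The sketch below assumes the blocks of the joint covariance matrix are already available as arrays; the function and argument names are hypothetical.

```python
import numpy as np

def wiener_filter_onto(data1, c_t11, c_n11, c_t21, c_t22):
    """Most probable signal in dataset two given dataset one.

    w21 = C_T21 (C_T11 + C_N11)^{-1}
    M   = C_T22 - w21 C_T12, with C_T12 = C_T21^T
    """
    w21 = np.linalg.solve((c_t11 + c_n11).T, c_t21.T).T
    mean2 = w21 @ data1
    cov2 = c_t22 - w21 @ c_t21.T
    return mean2, cov2

# Two single pixels looking at the same sky point (fully correlated signal),
# with signal variance 1 and noise variance 1 in dataset one.
one = np.array([[1.0]])
mean2, cov2 = wiener_filter_onto(np.array([2.0]), one, one, one, one)
print(mean2, cov2)

# With no signal correlation, dataset one says nothing about dataset two:
# the mean is zero and the covariance falls back to the prior C_T22.
mean2, cov2 = wiener_filter_onto(np.array([2.0]), one, one,
                                 np.array([[0.0]]), np.array([[3.0]]))
print(mean2, cov2)
```

The second case illustrates why the predicted uncertainty stays finite away from the observed region: it is bounded by the prior signal covariance.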

Thus Wiener-filtering provides us with an excellent 
tool for visual comparison of datasets. Even if each 
dataset is expressed in different generalized pixels, since 
we can Wiener filter one onto the other, we can com- 
pare the signal predictions in the same space. We will 
see applications of this following the next section, which 
describes the MSAM and Saskatoon datasets. 

The Wiener filter can also be derived without reference 
to anything other than the two-point correlation functions 
of the signal and noise. Assume we want $w$ to be 
such that the variance $\langle (s - wA)(s - wA)^T \rangle$ is minimal. 
Differentiating with respect to $w_{ij}$, setting to zero and 
solving for $w_{ij}$ results in $w = C_T (C_T + C_N)^{-1}$. Thus the 
minimum-variance estimate of the signal does not depend 
on the Gaussianity of the signal and noise distributions, 
although, of course, the uncertainty in the estimate does. 

† In Ref. [ ] the Wiener filter was used to calculate the most 
probable signal in the Tenerife data, given the COBE/DMR 
data. 

One final expression we will need below is the probability 
distribution for the data itself, $A_2$ (as opposed to 
the signal in the second dataset), given $A_1$ and the relevant 
matrices. It is the same as Eq. (3.6) after changing $s_2$ 
to $A_2$ and $M$ to $M + C_{N22}$. 



B. Applications 

For Gaussian signal and noise, the Wiener filter provides 
the maximum-likelihood reconstruction of the signal; 
it is also optimal in the minimum-variance sense discussed 
above. One can construct a Wiener filter from the 
pixelized data space onto the same space or from the 
pixelized data space to any other linear combination of map 
pixels, such as the map pixels themselves. Wiener filter 
maps have been made for the SK dataset [ ] and the 
COBE/DMR dataset [ ]. Map-making, though, is not the 
most useful means for comparing observations that are 
not themselves maps, and it is not suggested by the 
statistical techniques we discussed earlier. Here we Wiener 
filter onto the experimental pixel space itself. 



1. Description of the datasets 

Before jumping into the applications to the Saskatoon 
and MSAM datasets we must describe them. They have 
considerable spatial overlap and similar angular resolu- 
tions. Otherwise, however, the two datasets are very 
different and a comparison provides a strong check on 
systematic errors. 

MSAM is a balloon-borne bolometric instrument with 
approximately half-degree (FWHM) resolution in 4 frequency 
bands centered at 170, 280, 500 and 680 GHz. 
The data, at each frequency, are binned into pixels on the 
sky with two different antenna patterns, $H$, referred to 
as 2-beam and 3-beam or single-difference and 
double-difference (see corresponding window functions in Fig. 1). 
Simultaneously, long time-scale drifts are removed, 
which has the effect of introducing off-diagonal noise 
correlations. From these multi-frequency data, a fit is made 
to temperature fluctuations about a 2.73 K black-body 
component and the optical depth of a dust component. 
The dust is assumed to have a temperature of 20 K and 
emissivity that varies with frequency to the 1.5 power. 

The MSAM instrument flew in 1992 [ ], 1994 [ ] 
and 1995 [ ]. In each year a narrow strip of sky with 
nearly constant declination was observed. The purpose 
of the 1994 flight was to confirm the results from the 
1992 flight and so each targeted the same strip of sky at 
$\delta = 82^\circ$ (see Fig. 2). Note that, due to, for example, 
imperfect pointing control, the two flights have slightly 
different sky coverage. The final flight in 1995 observed 
near declination $\delta = 80.5^\circ$, chosen to be sufficiently far 
away from the first two flights for the signal correlations 
to be negligible. Therefore we do not consider the 1995 
flight any further in this paper. 

The SK data are reported as complicated chopping 
patterns (i.e., beam patterns, H, above) in a circle of 
radius about 8° around the North Celestial Pole. The 
data were taken over 1993-1995. Here we only use the 
1995 data which were taken with angular resolution 0.5° 
FWHM at approximately 40 GHz. More details can be 
found in [ ]. 

The bulk of the data were in the "cap" configuration: 
constant-elevation scans tracing out curved rays from the 
pole, which were then binned in RA and subjected to 
various sinusoidal demodulation templates in software. 
Some of the 1995 data (0.5° beam), however, were taken 
in the "ring" configuration, which isolated the data taken 
at $\delta = 82^\circ$; these were put into 96 RA bins and then 
subjected to 3, 4, 5 and 6 point sinusoidal demodulations, 
this time along lines of constant declination. The ring data 
window functions are in Fig. 1. The region of overlap of the 
SK95 ring data with the MSAM data can be seen in Fig. 2. 

The calibration of the SK dataset was performed by 
comparing with the radio source Cassiopeia A (Cas A). 
However, this source's 30-40 GHz flux itself is poorly 
determined; hence, the original SK dataset was reported 
with a 14% calibration error. More recently, Leitch [ ] 
in turn used the very well-determined amplitude of the 
CMB dipole itself to determine the flux of Cas A; this has 
resulted in a 5% increase in the temperature of the SK data 
(and errors), with a reduced calibration error of 7% (the 
flux of Cas A itself is now determined to 5%, but there are 
additional sources of calibration error [ ]). Except in 
Section VII, in the following we do not include the effects 
of calibration uncertainty. 




FIG. 1. The diagonal elements of the window function 
matrix $W_{\ell ii}$ for the four SK ring antenna patterns (solid) and 
the two MSAM antenna patterns (dashed), plotted against 
multipole moment $\ell$. These show how the power spectrum 
contributes to the variance of the data (see Eq. 2.7). 

FIG. 2. Observation locations. The SK RING data covered 
the entire circle of radius 8 degrees around the NCP. The 
centers of the MSAM92 (MSAM94) pixels are indicated with 
triangles (squares). 



2. Wiener-filtering MSAM92 

An example of Wiener filtering with Eq. (3.3) is shown 
in Fig. 3. The data points are the values of the pixelized 
data, located horizontally according to the right ascen- 
sion of the center of the pixel. The dependence of the pix- 
els on declination and twist has been suppressed. The er- 
ror bars are from the diagonal part of the (non-diagonal) 
noise covariance matrix. The central curve is the Wiener- 
filtered data and the bounding curves indicate the 68% 
confidence region for the signal. Because of the differ- 
ence between the noise covariance matrix and the sig- 
nal matrix, the Wiener filter essentially assumes that the 
high frequency behavior is noise and therefore smooths 
out the data. This smoothing is complicated by the off- 
diagonal noise correlations which explains some apparent 
disagreements between the data and the Wiener-filtered 
data. For example, around 20 hours in the top panel, the 
Wiener-filtered data are consistently above a number of 
the data points. 
FIG. 3. An example of Wiener filtering. The points with 
error bars are the MSAM92 pixelized data, plotted against 
RA (hours). Two-beam in top panel, 3-beam in bottom panel. 
The three curves in each panel are the Wiener-filtered data 
bounded by ± one standard deviation. 

The Wiener filter is model-dependent: one must know 
(or assume) covariance matrices for the noise and signal. 
Presumably the noise covariance matrix is well-known 
and so the model-dependence resides in the choice of 
angular power spectrum. Of course, we can gain some 
knowledge of the angular power spectrum by performing 
a likelihood analysis of the data. The Wiener filter 
is generally quite robust to changes in the angular power 
spectrum that are smaller than those that significantly 
alter the likelihood; even large changes usually have very 
little effect. We demonstrate this robustness here with 
Fig. 4, which shows the Wiener-filtered data for a standard 
CDM spectral shape and also for a flat spectrum 
($C_\ell$ = constant). 




C. Wiener-filtering MSAM94 onto MSAM92 

Besides Wiener filtering the data onto its own pixel 
space, we can Wiener filter it onto another pixel space 
(Eq. 3.7). This provides an excellent visual tool for check- 
ing compatibility of results. We show this first for the 
Wiener filtering of MSAM94 onto MSAM92, together 
with MSAM92 onto MSAM92 from the previous subsection. 
Notice that in Fig. 5 the 68% confidence regions 
mostly overlap each other. One can see the MSAM94 re- 
gion get wider at either extreme in RA. This is because 
the MSAM94 pixels have a slightly shorter RA extent 
than the MSAM92 pixels (14.9$^{\rm h}$ to 20.1$^{\rm h}$ compared to 
14.5$^{\rm h}$ to 20.3$^{\rm h}$). 
FIG. 5. Wiener filters onto 1992 pixels for 1992 data 
(vertical lines) and 1994 data (horizontal lines), plotted 
against RA (hours). The curves are realizations consistent 
with the 1994 data. Two-beam in top panel, three-beam in 
bottom panel. 

One can see in the figure that many features are seen by 
both datasets; they agree quite well. The most significant 
differences between the two estimates of the signal are in 
the region of 15.5 hours for the 2-beam signal and 14.5 
hours for the 3-beam signal. We will discuss these slight 
anomalies later. 



D. SK95 onto MSAM92 and MSAM92 onto SK95 

Figure 6 shows the same thing as Fig. 5 except that 
MSAM94 has been replaced with Ring95. Once again, 
the first impression is of general agreement, although the 
discrepancies here (at large RA) appear to be more sig- 
nificant than those seen in the MSAM92/MSAM94 com- 
parison. 
FIG. 4. Wiener filter model-dependence for MSAM92. The 
standard CDM (flat) spectrum was assumed for the solid 
(dashed) curve. 

FIG. 6. The 1992 (vertical lines) and 1995 data (horizontal 
lines) Wiener-filtered onto the 1992 pixels. 

We can also filter the MSAM92 data onto the four Ring 
templates, as shown in Fig. 7. We have chosen the range 
of this plot to extend in RA further than the MSAM cov- 
erage. This allows one to see how the constraint behaves 
outside of the region of MSAM's influence. Notice that 
the errors don't become infinite. This is because of the 
prior information that went into the estimate of the prob- 
ability distribution, i.e., the assumed power spectrum. 
Also note that the data have some influence slightly beyond 
the limit of the sky coverage. The dominant reason 
for this is the spatial extent of the antenna patterns. In 
addition, the intrinsic correlations (assumed in the prior) 
extend the influence to slightly beyond where the antenna 
response is zero. 
FIG. 7. Wiener filters onto 1995 pixels for 1992 data 
(vertical lines) and SK 1995 data (horizontal lines). 

With two dimensional Wiener filter maps (as with any 
two dimensional map) it is difficult to plot both the map 
and a confidence region expressing the level of uncertainty 
as we have done here for essentially one dimensional 
data. In 2D it is therefore often useful to show, 
in addition to the mean signal, several realizations 
consistent with its probability distribution (Eq. 3.2 or 3.6). 
Looking at several realizations allows one to see which 
features are significant and which aren't. Realizations 
can also be useful in the 1D case to make up for the fact 
that the confidence region does not contain any information 
about correlated uncertainties. For the applications 
here, though, we have not found them to be useful and 
so have not shown any. 



IV. BAYESIAN COMPARISON 

A natural question to ask is, "How consistent are the 
two datasets?" The Wiener filter gives a visual, qualitative 
answer to the question, but we would also like some 
quantitative answers. A better-formulated question 
is, "Is my model of the data an adequate description 
of the two datasets together?" To answer this question, 
one can extend the model of the data to include a residual 
and then check to see if this extension increases the 
likelihood. For example, one could add a residual that is 
Gaussian-distributed with zero mean: 

$$A_i = s_i + n_i + r_i$$

$$\langle A_i A_j \rangle = C_{T,ij} + C_{N,ij} + C_{{\rm res},ij}. \qquad (4.1)$$

Further restrictions on the form of $C_{{\rm res},ij}$ must be made 
for the problem to not be degenerate. One could choose 
$C_{{\rm res},ij}$ to be appropriate for a particular foreground 
contaminant [ ], increased noise or anything else that 
inspection of the data, combined with prior knowledge, 
has led the analyzer to suspect. Below we describe a 
particular choice of $C_{{\rm res},ij}$ that is useful in the absence of any 
hints as to the likely nature of a possible contaminant. 



A. The contamination parameter, $\gamma$ 

To test the consistency of the pairs of datasets, or 
rather, to test the adequacy of our model of the datasets, 
we introduce the following residual: 

$$A_1 = s_1 + n_1 + \gamma_1 r_1 \qquad (4.2)$$

and likewise for $A_2$. To reduce the number of parameters 
in this model for the residual, we set $\gamma = \gamma_1 = \gamma_2$. Now 
we must specify the probability distribution of $r$. For 
simplicity, let's take it to be a Gaussian random variable 
with zero mean. Clearly we want the cross-term in the 
variance to be zero ($\langle r_1 r_2^T \rangle = 0$) since we have in mind 
contaminants that are particular to each dataset. There 
is a lot of freedom in the choice of $\langle r_1 r_1^T \rangle$ and $\langle r_2 r_2^T \rangle$; 
once again for simplicity let's take these to be equal to 
$C_{T11}$ and $C_{T22}$. 
We have just added one parameter to whatever other 
parameters we were using to define the power spectrum. 
The model for the power spectrum we use here is standard 
CDM, with the amplitude as the only free parameter. 
We have expressed the amplitude as $\sigma_8$, the rms 
fluctuations in mass in $8\,h^{-1}$ Mpc spheres. The experiments 
in question do not have sufficient dynamic range 
to strongly constrain more than this one parameter. For 
COBE-normalized standard CDM, $\sigma_8 = 1.2$. 

We can now explicitly show the complete parameter 
dependence of the covariance matrix in our model, by 
modifying Eq. (3.5) to: 

$$C = \begin{pmatrix} \sigma_8^2 (1 + \gamma^2) \tilde{C}_{T11} + C_{N11} & \sigma_8^2 \tilde{C}_{T12} \\ \sigma_8^2 \tilde{C}_{T21} & \sigma_8^2 (1 + \gamma^2) \tilde{C}_{T22} + C_{N22} \end{pmatrix} \qquad (4.3)$$

where the tilde means the quantity is evaluated for $\sigma_8 = 1$. 

We prefer to work with a slightly different parameterization 
(spanning the same model space) by replacing $\sigma_8^2$ 
with $(\sigma_8')^2 = \sigma_8^2 (1 + \gamma^2)$, which is the amplitude for the 
variance of the signal and the contaminant combined. We 
prefer $\sigma_8'$ to $\sigma_8$, since the probability distribution of this 
quantity should be relatively independent of the level of 
contamination. Further, we prefer to use the fraction 
of contamination, $\gamma/\sqrt{1 + \gamma^2}$, rather than the 
contamination parameter itself. Probability distributions for $\sigma_8'$ 
and $\gamma/\sqrt{1 + \gamma^2}$ can be seen in Fig. 8. 




FIG. 8. Contours of the likelihood of $\sigma_8'$ vs. the fractional 
contamination, $\gamma/(1 + \gamma^2)^{1/2}$, for the MSAM92 and MSAM94 
datasets (top panel) and the MSAM92 and SK95 datasets 
(bottom panel). The contours indicate reductions in probability 
from the maximum by factors of $e^{1/2}$, $e^{2}$, etc. 

One can see from the shape of the contour curves that 
$\gamma/\sqrt{1 + \gamma^2}$ and $\sigma_8'$ are very nearly uncorrelated. The 
reason is that the dominant contribution to the determination 
of $\sigma_8'$ comes from terms in the likelihood proportional 
to $A_i A_j$ where $A_i$ and $A_j$ are in the same dataset, 
whereas $\gamma$ is entirely determined by the cross-terms. 

The most probable level of contamination indicated by 
the MSAM92/MSAM94 comparison is about 12%. How- 
ever, there is virtually no evidence for non-zero contami- 
nation since the probability of zero contamination is only 
about 5% less. Complete contamination is strongly ruled 
out at more than $\exp(5^2/2) \simeq 2.7 \times 10^5$ times less probable. 
The MSAM92/SK95 datasets are much less constraining 
on the amount of contamination that may be 
present. While 50% is the most probable value, total 
contamination and no contamination are only about 13 
and 1.6 times less likely respectively. 

The residual, as we have modeled it here, is a particularly
difficult one to constrain since it very nearly has the same
statistical properties as the signal. We note that this is
desirable in the sense that the ability to constrain the residual
comes entirely from the comparison; that is, each dataset, by
itself, places no constraint on the fractional contamination. Thus
this model for the residual is a strong test of the agreement
between the two datasets, rather than anything internal to them.
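The parameterization above can be illustrated with a minimal Python sketch. It builds the joint covariance of Eq. 4.3 in terms of $\sigma_8'$ and the fractional contamination, then evaluates a zero-mean Gaussian log-likelihood on a grid of contamination fractions. The toy matrices, the function names `joint_covariance` and `log_likelihood`, and the use of NumPy are assumptions made for illustration only; this is not the code behind the numbers quoted in the text.

```python
import numpy as np

def joint_covariance(sigma8p, frac, ct_tilde, cn, n1):
    """Joint covariance of Eq. 4.3 written with sigma8' and the
    fractional contamination frac = gamma / sqrt(1 + gamma^2).

    ct_tilde : theory covariance for sigma8 = 1 (datasets 1 and 2 stacked)
    cn       : noise covariance, assumed uncorrelated between datasets
    n1       : number of data points in dataset 1
    """
    # sigma8^2 = sigma8'^2 / (1 + gamma^2) = sigma8'^2 * (1 - frac^2)
    cross_scale = sigma8p**2 * (1.0 - frac**2)
    c = sigma8p**2 * ct_tilde + cn           # diagonal blocks
    c[:n1, n1:] = cross_scale * ct_tilde[:n1, n1:]
    c[n1:, :n1] = cross_scale * ct_tilde[n1:, :n1]
    return c

def log_likelihood(data, c):
    """Log of a zero-mean Gaussian likelihood with covariance c."""
    _, logdet = np.linalg.slogdet(c)
    chi2 = data @ np.linalg.solve(c, data)
    return -0.5 * (logdet + chi2 + data.size * np.log(2.0 * np.pi))

# Toy (hypothetical) covariances and one simulated uncontaminated dataset.
rng = np.random.default_rng(0)
n1, n2 = 4, 3
n = n1 + n2
a = rng.normal(size=(n, n))
ct_tilde = a @ a.T / n                       # positive-definite toy theory matrix
cn = 0.5 * np.eye(n)
data = rng.multivariate_normal(np.zeros(n), ct_tilde + cn)

fracs = np.linspace(0.0, 1.0, 21)
lnl = np.array([log_likelihood(data, joint_covariance(1.0, f, ct_tilde, cn, n1))
                for f in fracs])
print("most probable fractional contamination on the grid:", fracs[np.argmax(lnl)])
```

At zero contamination the matrix reduces to $\sigma_8'^2 \tilde C_T + C_N$, and at a fraction of one the cross-blocks vanish, which is the limit used to define $H_\infty$.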



B. The probability enhancement factor, $\beta$

For many purposes, a much smaller extension into alternative
hypothesis space may be useful. In particular, instead of examining
a continuum, one could just compare the model with $\gamma = 0$ to the
model with $\gamma = \infty$, at fixed $\sigma_8'$. The interesting quantity
is how much more probable one model is than the other, a quantity
referred to as the odds. This particular odds, or rather its
logarithm, we refer to as $\beta$ and call it the probability
enhancement factor:

$$\beta = \ln \left[ \frac{P(\Delta_1 \Delta_2 | H_0)}{P(\Delta_1 \Delta_2 | H_\infty)} \right] \qquad (4.4)$$

where $H_0$ (not to be confused with the present value of the
Hubble constant!) is the hypothesis that $\gamma = 0$ and $H_\infty$ is
the hypothesis that $\gamma = \infty$. Both hypotheses are understood to
be fixed at the same $\sigma_8'$. One can see from Eq. 4.3 that the
cross-terms connecting the two different datasets in the covariance
matrix $C$ vanish when $\gamma \to \infty$ with $\sigma_8'$ fixed. Therefore
we can also write $\beta$ as

$$\beta = \ln \left[ \frac{P(\Delta_1 \Delta_2 | C)}{P(\Delta_1 | C)\, P(\Delta_2 | C)} \right] = \ln \left[ \frac{P(\Delta_1 | \Delta_2, C)}{P(\Delta_1 | C)} \right] \qquad (4.5)$$

where $C$ is understood to be $C$ in Eq. 4.3 with $\gamma = 0$ and the
second equality follows from the use of $P(AB|C) = P(A|BC)P(B|C)$.
This second equality gives rise to another interpretation of
$\beta$: $\beta$ indicates how much more probable dataset 1 is given
that dataset 2 exists than it would be without the existence of
dataset 2. And by the symmetry of the definition of $\beta$ we know
that the statement is true under switching of 1 and 2.*
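The two forms of Eq. 4.5 can be checked numerically. The sketch below, again in Python with NumPy, computes $\beta$ both as a ratio of joint to separate probabilities and as a ratio of conditional to unconditional probabilities for zero-mean Gaussian data. The function names and the toy covariance are hypothetical choices for illustration; $C$ is simply assumed to be a positive-definite matrix with the two datasets stacked.

```python
import numpy as np

def gaussian_logp(d, c):
    """Log probability density of d for a zero-mean Gaussian with covariance c."""
    _, logdet = np.linalg.slogdet(c)
    return -0.5 * (logdet + d @ np.linalg.solve(c, d) + d.size * np.log(2.0 * np.pi))

def beta(d1, d2, c):
    """First form of Eq. 4.5: ln[ P(d1, d2 | C) / (P(d1 | C) P(d2 | C)) ]."""
    n1 = d1.size
    joint = gaussian_logp(np.concatenate([d1, d2]), c)
    return joint - gaussian_logp(d1, c[:n1, :n1]) - gaussian_logp(d2, c[n1:, n1:])

def beta_conditional(d1, d2, c):
    """Second form of Eq. 4.5: ln[ P(d1 | d2, C) / P(d1 | C) ]."""
    n1 = d1.size
    c11, c12, c22 = c[:n1, :n1], c[:n1, n1:], c[n1:, n1:]
    w12 = c12 @ np.linalg.inv(c22)       # maps dataset 2 onto dataset 1
    cond_mean = w12 @ d2                 # mean of d1 given d2
    cond_cov = c11 - w12 @ c12.T         # covariance of d1 given d2
    return gaussian_logp(d1 - cond_mean, cond_cov) - gaussian_logp(d1, c11)

# Toy example with a hypothetical covariance.
rng = np.random.default_rng(2)
n1, n2 = 3, 4
a = rng.normal(size=(n1 + n2, n1 + n2))
c = a @ a.T / (n1 + n2) + 0.2 * np.eye(n1 + n2)
d = rng.multivariate_normal(np.zeros(n1 + n2), c)
d1, d2 = d[:n1], d[n1:]
print(beta(d1, d2, c), beta_conditional(d1, d2, c))
```

The two printed values agree to numerical precision, and exchanging the roles of the two datasets (with the covariance permuted accordingly) leaves the result unchanged.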

The probability enhancement factor, like the Wiener filter, depends
on the assumed power spectrum used to calculate the theoretical
covariance matrices. We find that for our parametrized model,
within the most likely region of parameter space, the dependence of
$\beta$ on the parameter is weak. In Fig. 9 we see the dependence of
$\beta(92,95)$ and $\beta(94,95)$ on $\sigma_8$.† Notice that it doesn't
make much difference to $\beta$ whether one uses the best fit value
of $\sigma_8$ given by one of the two experiments or by the joint
likelihood (or indeed by anything in the 68% confidence region)
because the dependence of $\beta$ on $\sigma_8$ is quite flat in this
region.




FIG. 9. Probability enhancement factor $\beta(92, 95)$ (top panel)
and $\beta(94, 95)$ (bottom panel) as a function of $\sigma_8$ (solid
curves). Also shown are $-\delta \ln \mathcal{L}$ for individual and joint
datasets. Identifying these curves by their minima, they are, from
left to right: MSAM92, MSAM92+SK, SK in the top panel and MSAM94,
MSAM94+SK, SK in the bottom panel.

As can be seen from the log likelihood curves, the different
datasets prefer slightly different values of $\sigma_8$.‡ For all
calculations of $\beta$ below and for the Wiener filtering in the
previous section we have chosen a value of $\sigma_8 = 1.2$, in between
the preferred values for SK and MSAM. It is also the normalization
for this power spectrum suggested by the DMR data.

* There is even a third interpretation of $\beta$ as the log of the
ratio of probability of no relative pointing error, to that of a
gross relative pointing error which leaves the fields completely
uncorrelated.

† To be precise, we mean $\sigma_8'$ but in the following we drop this
prime for simplicity and also because keeping the prime does not
make sense in the context of the interpretation of $\beta$ as the
increase in probability of one dataset given the other dataset.

‡ Some of this discrepancy may be due to calibration uncertainty
which is not included in these log likelihood curves. We address
this issue in a later section.



V. FREQUENTIST STATISTICS 

We now discuss $\beta$ from the frequentist perspective. The
frequentist approach to checking the consistency of a dataset is to
invent some function of the data, called a statistic, and then to
compare the measured value of the statistic to its probability
distribution under various hypotheses. The probability enhancement
factor, $\beta$, can be viewed as a statistic since it is a function
of the data. In fact, it is the logarithm of the well-known
likelihood ratio statistic, in this case the ratio of the
likelihood of $H_0$ to the likelihood of $H_\infty$.

Some statistics are better than others at distinguishing among
competing hypotheses. In this section, we see how $\beta$ and other
statistics fare at discriminating between hypotheses $H_0$ and
$H_\infty$.



A. Probability distributions of quadratic statistics 

We restrict ourselves to studying quadratic functions of the data,
for which we have analytic expressions for the mean and variance.
In addition to various different $\chi^2$ quantities (to be defined
below), the probability enhancement factor, due to the logarithm in
the definition, is also a quadratic function of the data:

$$\beta = -\tfrac{1}{2} \ln |C| - \tfrac{1}{2} \Delta^T C^{-1} \Delta + \tfrac{1}{2} \ln |C_{11}| + \tfrac{1}{2} \Delta_1^T C_{11}^{-1} \Delta_1 + \tfrac{1}{2} \ln |C_{22}| + \tfrac{1}{2} \Delta_2^T C_{22}^{-1} \Delta_2, \qquad (5.1)$$



which follows from Eq. 4.5. Since it is a quadratic function of the
data, it is straightforward to calculate the mean and variance.

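In general, any quadratic function of the data,
$Q = \Delta^T M \Delta + {\rm constant}$, has a mean under hypothesis $X$ of

$$\bar Q_X \equiv \langle Q \rangle_X = {\rm Tr}\,(C_X M) + {\rm constant} \qquad (5.2)$$

and a variance of

$$\delta Q_X^2 \equiv \langle (Q - \bar Q_X)^2 \rangle_X = 2\, {\rm Tr}\,(C_X M C_X M) \qquad (5.3)$$

where hypothesis $X$ is specified by $C_X = \langle \Delta \Delta^T \rangle_X$.
For the case of $\beta$ we have, for hypotheses $H_0$ and $H_\infty$:

$$\begin{aligned}
\langle \beta \rangle_0 &= \tfrac{1}{2} \ln \left( \frac{|C_{11}|\, |C_{22}|}{|C|} \right) \\
\langle (\beta - \langle \beta \rangle_0)^2 \rangle_0 &= {\rm Tr}\,(w_{12} w_{21}) \\
\langle \beta \rangle_\infty &= \langle \beta \rangle_0 + \tfrac{1}{2} {\rm Tr}\,(1 - C_\infty C^{-1}) \\
\langle (\beta - \langle \beta \rangle_\infty)^2 \rangle_\infty &= \tfrac{1}{2} {\rm Tr} \left[ (1 - C_\infty C^{-1}) (1 - C_\infty C^{-1}) \right]
\end{aligned} \qquad (5.4)$$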
Note that if the experiments have nothing to do with each other
($C_{12} = 0$) then the numerator and denominator of the argument of
the logarithm are equal and therefore $\langle \beta \rangle_0 = 0$, as we
expect from the definition of $\beta$ in Eq. 4.4.
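The moments above are easy to evaluate numerically. The following Python sketch applies Eqs. 5.2 and 5.3 to $\beta$, written as a constant plus $\Delta^T M \Delta$ with $M = \frac{1}{2}(C_\infty^{-1} - C^{-1})$, where $C_\infty$ is $C$ with its cross-blocks set to zero. It assumes that the Wiener filters are $w_{12} = C_{12} C_{22}^{-1}$ and $w_{21} = C_{21} C_{11}^{-1}$; that assumption, the function names and the toy covariance are all illustrative rather than taken from the analysis itself.

```python
import numpy as np

def quadratic_moments(m, cx):
    """Mean (without the additive constant) and variance of Delta^T M Delta
    for Gaussian Delta with covariance cx and symmetric M (Eqs. 5.2, 5.3)."""
    cm = cx @ m
    return np.trace(cm), 2.0 * np.trace(cm @ cm)

def beta_moments(c, n1):
    """Means and variances of beta under H_0 and H_infinity."""
    c_inf = c.copy()
    c_inf[:n1, n1:] = 0.0                # H_infinity: no cross-correlation
    c_inf[n1:, :n1] = 0.0
    m = 0.5 * (np.linalg.inv(c_inf) - np.linalg.inv(c))
    # constant term: (1/2) ln(|C11| |C22| / |C|)
    const = 0.5 * (np.linalg.slogdet(c_inf)[1] - np.linalg.slogdet(c)[1])
    mean0, var0 = quadratic_moments(m, c)
    mean_inf, var_inf = quadratic_moments(m, c_inf)
    return const + mean0, var0, const + mean_inf, var_inf

# Toy (hypothetical) joint covariance for two correlated datasets.
rng = np.random.default_rng(3)
n1, n2 = 5, 4
a = rng.normal(size=(n1 + n2, n1 + n2))
c = a @ a.T / (n1 + n2) + 0.1 * np.eye(n1 + n2)

mean0, var0, mean_inf, var_inf = beta_moments(c, n1)
print("H_0:  ", mean0, "+/-", np.sqrt(var0))
print("H_inf:", mean_inf, "+/-", np.sqrt(var_inf))
```

Under $H_0$ the trace term vanishes, so the mean is just the logarithmic constant, and the variance agrees with ${\rm Tr}(w_{12} w_{21})$ for the Wiener filters assumed here.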



Given the observed value of $\beta$, we can assess the validity of
the two hypotheses by calculating the probability distribution of
$\beta$ under each hypothesis. As shown above we can calculate the
mean and variance analytically. To calculate the entire
(non-Gaussian) distribution function, though, we have used the
Monte Carlo method. The results are plotted in Fig. 10 for the
three possible pairings of the three datasets. The Monte Carlo
method is quick because we first rotate to a basis where everything
is diagonal and then make the realizations. The rotation to the
diagonal basis only needs to be found once. The plots shown use
between 4000 and 17000 realizations. Notice that the distribution
of $\beta$ under $H_0$ is well-approximated by a Gaussian. The
deviations from Gaussianity are larger under $H_\infty$.
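One way to realize the "rotate once, then draw" strategy is sketched below in Python. Writing $\Delta = L z$ with $C_X = L L^T$ and $z$ a vector of unit Gaussians, the quadratic part of $\beta$ becomes $\sum_i \lambda_i y_i^2$, where the $\lambda_i$ are the eigenvalues of $L^T M L$ and the $y_i$ are again independent unit Gaussians. The Cholesky-plus-eigenvalue route, the function name `beta_samples` and the toy covariance are assumptions for illustration; the text does not specify how the diagonal basis was obtained.

```python
import numpy as np

def beta_samples(c, n1, hypothesis, n_real, rng):
    """Monte Carlo realizations of beta under 'H0' or 'Hinf'."""
    c_inf = c.copy()
    c_inf[:n1, n1:] = 0.0
    c_inf[n1:, :n1] = 0.0
    m = 0.5 * (np.linalg.inv(c_inf) - np.linalg.inv(c))
    const = 0.5 * (np.linalg.slogdet(c_inf)[1] - np.linalg.slogdet(c)[1])

    cx = c if hypothesis == "H0" else c_inf
    l = np.linalg.cholesky(cx)                 # Delta = l @ z, z ~ N(0, 1)
    lam = np.linalg.eigvalsh(l.T @ m @ l)      # diagonal basis, found once

    # In the diagonal basis each realization is a weighted sum of
    # independent chi-squared variables with one degree of freedom.
    y = rng.standard_normal((n_real, lam.size))
    return const + (y**2) @ lam

# Toy (hypothetical) joint covariance.
rng = np.random.default_rng(1)
n1, n2 = 5, 4
a = rng.normal(size=(n1 + n2, n1 + n2))
c = a @ a.T / (n1 + n2) + 0.1 * np.eye(n1 + n2)

samples0 = beta_samples(c, n1, "H0", 20000, rng)
samples_inf = beta_samples(c, n1, "Hinf", 20000, rng)
print("H_0:  ", samples0.mean(), samples0.std())
print("H_inf:", samples_inf.mean(), samples_inf.std())
```

Histograms of `samples0` and `samples_inf` play the role of the Monte Carlo points in the figure, and their sample means and standard deviations can be compared with the analytic values of Eq. 5.4.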




FIG. 10. The measured values of $\beta$ (vertical lines) and its
(arbitrarily normalized) probability distribution functions under
the two hypotheses. From top to bottom: $\beta(92, 94)$,
$\beta(92, 95)$ and $\beta(94, 95)$. The curves peaking at positive
$\beta$ are estimates of $P(\beta|H_0)$ and those peaking at negative
$\beta$ are estimates of $P(\beta|H_\infty)$. The points with error bars
are the results of a Monte Carlo calculation while the solid curves
are Gaussians with the analytically calculated means and variances.

We see in the figure that $\beta(92, 94) = 13$, which is consistent
with the expected range for hypothesis $H_0$ of $15 \pm 4.1$. As a
measure of the consistency, we have calculated the probability of
getting a $\beta$ greater than this to be 0.70. We also see that
under hypothesis $H_\infty$ such a value of $\beta$ is extremely
unlikely; the probability of getting a $\beta$ greater than the
measured one is less than 1%. We also find consistency with $H_0$
for the other two pairs of datasets: $\beta(92,95) = 2.1$ (cf.
$\langle \beta \rangle_0 = 7.4 \pm 3.2$) and $\beta(94,95) = 2.4$ (cf.
$\langle \beta \rangle_0 = 4.4 \pm 2.6$). For both of these, under
hypothesis $H_\infty$, the probability of getting a value of $\beta$ as
high or higher than the measured one is 1%.



B. Comparison of comparisons 

There are a handful of other quadratic functions of the 
data one might consider using for comparison of datasets. 
Here we define the ones under consideration by specifying 
the data vectors on which they are based: 





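$$\begin{array}{lll}
\chi^2_J: & \Delta & (5.5) \\
\chi^2_w: & \Delta - w\Delta & (5.6) \\
\chi^2_{w1}: & (\Delta_1 - w_{12}\Delta_2) & (5.7) \\
\chi^2_{w2}: & (\Delta_2 - w_{21}\Delta_1) & (5.8) \\
\chi^2_{w12}: & (w_{12}\Delta_2 - w_{11}\Delta_1) & (5.9) \\
\chi^2_{w21}: & (w_{21}\Delta_1 - w_{22}\Delta_2) & (5.10)
\end{array}$$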

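We clarify what we mean by two examples:

$$\chi^2_J = \Delta^T M^{-1} \Delta \qquad (5.11)$$

where $M = \langle \Delta \Delta^T \rangle_0 = C$, and

$$\chi^2_{w12} = (w_{12}\Delta_2 - w_{11}\Delta_1)^T M^{-1} (w_{12}\Delta_2 - w_{11}\Delta_1), \qquad (5.12)$$

where

$$M = \langle (w_{12}\Delta_2 - w_{11}\Delta_1) (w_{12}\Delta_2 - w_{11}\Delta_1)^T \rangle_0 = (w_{11} - w_{12} w_{21})\, C_{T11} + (w_{12} - w_{11} w_{12})\, C_{T21}. \qquad (5.13)$$

The $J$ stands for joint, since this is the quantity in the joint
likelihood function, $P(\Delta|C)$. It is straightforward to show that
$\chi^2_J = \chi^2_w$ but, other than this relationship, the above
$\chi^2$s are all independent quantities.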

To judge the discriminating power of all our quadratic statistics,
we use the separation factor,

$$|\bar Q_0 - \bar Q_\infty| / \delta Q_0, \qquad (5.14)$$

where $\bar Q_0$, $\bar Q_\infty$ are the means under the two hypotheses
and $\delta Q_0$ is the standard error under $H_0$. The separation
factor is shown as a function of $\sigma_8$ in Fig. 11. To avoid
clutter, only two of the $\chi^2$ quantities are shown, $\chi^2_J$ and
$\chi^2_{w12}$. The separation factors for the other $\chi^2$s are
bounded by these two.

One can clearly see the superiority, under this measure, of the
Bayesian-motivated probability enhancement factor. For example, for
$\sigma_8 = 0.6$, if we assume $H_0$, it requires an $8\sigma$ fluctuation
to get $\beta = \langle \beta \rangle_\infty$ but only a $2\sigma$ ($3\sigma$)
fluctuation to get $\chi^2_J = \langle \chi^2_J \rangle_\infty$
($\chi^2_{w12} = \langle \chi^2_{w12} \rangle_\infty$). The increase in all the
separation factors with increasing $\sigma_8$ is expected since
discriminating power should increase with increasing
signal-to-noise of the measurements.
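Because additive constants cancel in the difference of means, the separation factor of Eq. 5.14 depends only on the matrix $M$ of the quadratic form and on the two covariance matrices. A minimal Python sketch follows; the function names and the toy covariance are hypothetical, and $C_\infty$ is taken to be $C$ with its cross-blocks zeroed.

```python
import numpy as np

def separation_factor(m, c0, c_inf):
    """|Q0 - Qinf| / dQ0 for the quadratic statistic Delta^T M Delta."""
    mean0 = np.trace(c0 @ m)
    mean_inf = np.trace(c_inf @ m)
    err0 = np.sqrt(2.0 * np.trace(c0 @ m @ c0 @ m))   # Eq. 5.3 under H_0
    return abs(mean0 - mean_inf) / err0

def block_diagonal(c, n1):
    """Copy of c with the blocks connecting the two datasets set to zero."""
    c_inf = c.copy()
    c_inf[:n1, n1:] = 0.0
    c_inf[n1:, :n1] = 0.0
    return c_inf

# Toy (hypothetical) joint covariance.
rng = np.random.default_rng(4)
n1, n2 = 5, 4
a = rng.normal(size=(n1 + n2, n1 + n2))
c = a @ a.T / (n1 + n2) + 0.1 * np.eye(n1 + n2)
c_inf = block_diagonal(c, n1)

m_beta = 0.5 * (np.linalg.inv(c_inf) - np.linalg.inv(c))   # quadratic part of beta
m_joint = np.linalg.inv(c)                                 # chi^2_J
print("beta:   ", separation_factor(m_beta, c, c_inf))
print("chi2_J: ", separation_factor(m_joint, c, c_inf))
```

The same function can be applied to any of the other $\chi^2$ statistics by building the matrix $M$ that corresponds to its data vector.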
[Figure 11: horizontal axis is $\sigma_8$]

FIG. 11. Separation factors for $\beta$ (blue, solid), $\chi^2_{w12}$
(green, dashed) and $\chi^2_J$ (magenta, dot-dashed). The top panel is
for the 92/94 comparison and the bottom panel for the 92/95
comparison. For $\chi^2_{w12}$ the smaller dataset is taken to be
dataset 1.

The separation factor, as we have defined it, is the separation
between the expected values under the two hypotheses in units of
the standard error assuming $H_0$ ($\delta Q_0$). One might also choose
as another measure of discriminating power this separation in units
of the standard error assuming $H_\infty$ ($\delta Q_\infty$). In showing
that $\beta$ performs well under this measure we are assisted by a
theorem: the likelihood ratio test is most powerful.

A simple hypothesis test can be made from any statistic by choosing
some critical value $Q^*$: if $Q > Q^*$ then reject $H_0$; otherwise,
accept $H_0$.§ Statisticians discuss the size and power of a test
designed to discriminate between two hypotheses. The size of the
test is the probability of rejecting $H_0$ if $H_0$ is true while the
power is the probability of rejecting $H_0$ if $H_\infty$ is true.
Clearly, we want the test to be such that the size is small and the
power is large. By changing $Q^*$ we can choose the size. The test
based on the likelihood ratio statistic has the property that, for
a given size, it is most powerful; that is, no other test with the
same size has a greater power. For a discussion of the likelihood
ratio statistic in the context of CMB observations see, e.g., [?].

To see the relevance to our separation factor, let's set
$Q^* = \bar Q_0$. Let's further assume that we are in the asymptotic
limit of large datasets so that all probability distributions are
Gaussian. With $Q^* = \bar Q_0$, the size of the test will be 0.5 for
all statistics. Since the size of this test is the same for all
statistics, we know that the likelihood ratio test ($Q = \beta$) will
have the largest power. For $Q^* = \bar Q_0$ the power is given by



§ This assumes $\bar Q_0 < \bar Q_\infty$; if not, then the test should be
changed so that $H_0$ is rejected when $Q < Q^*$.



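$$\begin{aligned}
{\rm power} &= \frac{1}{2} + \int_{\bar Q_0}^{\bar Q_\infty} \frac{dQ}{\sqrt{2\pi\, \delta Q_\infty^2}} \exp \left[ -\frac{(Q - \bar Q_\infty)^2}{2\, \delta Q_\infty^2} \right] \\
&= \frac{1}{2} + \frac{1}{2}\, {\rm erf} \left( \frac{\bar Q_\infty - \bar Q_0}{\sqrt{2\, \delta Q_\infty^2}} \right).
\end{aligned} \qquad (5.15)$$

Since the error function monotonically increases with its argument,
we see that the separation between $\bar Q_0$ and $\bar Q_\infty$ in units
of $\delta Q_\infty$ will always be largest for the likelihood ratio
statistic, $\beta$.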
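For orientation, the Gaussian-limit power can be evaluated in a few lines of Python. The function name is hypothetical, and the absolute value is an assumption that lets one expression cover both orderings of the two means (the rejection region being flipped accordingly). The example inputs are the 92,94 moments of $\beta$ from Table I; since the distribution under $H_\infty$ is visibly non-Gaussian, the result is only indicative.

```python
import math

def power_at_half_size(q0, q_inf, dq_inf):
    """Power of the test with critical value Q* = Q0 (size 0.5),
    assuming Q is Gaussian under H_infinity with mean q_inf and
    standard error dq_inf (the Gaussian limit of Eq. 5.15)."""
    separation = abs(q0 - q_inf) / dq_inf
    return 0.5 + 0.5 * math.erf(separation / math.sqrt(2.0))

# Moments of beta for the 92,94 pairing, taken from Table I.
print(power_at_half_size(15.0, -58.4, 27.4))
```

The power depends on the two means only through their separation in units of $\delta Q_\infty$, which is why that ratio is the natural second measure of discriminating power.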

We end this section with a brief consideration of one more
quantity. One could ask if there is a set of map pixels, $T$, that is
consistent with the noise distribution:

$$\chi^2_n \equiv (\Delta - s)^T C_n^{-1} (\Delta - s); \quad s \equiv HT \qquad (5.16)$$

Because of its model independence, one might also think that
$\chi^2_n$ is a compelling choice for testing the consistency of two
datasets. However, if the pixels for the two datasets are slightly
different, then it will almost certainly be the case that a set of
map pixels can be found that gives a reduced $\chi^2_n$ near unity.
The problem is that this sky map may contain sharp spikes, highly
inconsistent with our prior assumptions.

VI. APPLYING $\beta$ TO SUBSETS OF DATA

We have also calculated $\beta$ for various pairings of subsets of
the data; the results are in the Table. All but one pairing (to be
discussed later) have values of $\beta$ within $2\sigma$ of
$\langle \beta \rangle_0$. Note that the last 8 rows of the table are the
results for internal consistency checks.

Also included in the table are the values of $\chi^2_{w12}$. Under the
separation factor criterion, this was the best other quadratic
statistic. It is also of particular relevance to the Wiener filter
figures since these show the data vectors on which $\chi^2_{w12}$ is
based.

Most of the reduced $\chi^2_{w12}$ values are comfortably close to
unity. The probability of exceeding $\chi^2$ is less than 5% for only
one of the entries: the 95_4,95_5 internal consistency check, for
which the probability is less than 1%.
datasets     $\beta$    $\langle\beta\rangle_0 \pm \delta\beta_0$    $\langle\beta\rangle_\infty \pm \delta\beta_\infty$    $\chi^2_{w12}/\nu$    $\nu$

92,94_2       5.7     10.9 ± 3.6      -37.6 ± 20.0       1.08     218
92,94_3      14.5     11.2 ± 3.6      -39.0 ± 20.1       1.05     218
92,94        12.8     15.0 ± 4.1      -58.4 ± 27.4       1.02     218
92,95_3      -2.5      4.4 ± 2.5       -8.5 ± 5.4        1.12     218
92,95_4       4.6      3.2 ± 2.2       -5.3 ± 3.6        1.11     218
92,95_5      -1.2      1.6 ± 1.7       -2.031 ± 1.4      1.05     218
92,95_6      -0.29     0.56 ± 1.03     -0.61 ± 0.49      1.06     218
92,95         2.13     7.4 ± 3.2      -15.6 ± 8.1        1.15     218
94,95_3       2.6      2.71 ± 2.08     -4.27 ± 3.05      1.02     170
94,95_4       1.4      1.94 ± 1.82     -2.69 ± 2.01      1.05     170
94,95_5      -0.31     0.99 ± 1.35     -1.14 ± 0.85      1.06     170
94,95_6      -0.99     0.35 ± 0.82     -0.365 ± 0.29     0.96     170
94,95         2.437    4.4 ± 2.63      -7.29 ± 4.2       1.05     170
92_2,92_3     8.29     8.80 ± 3.185   -31.6 ± 18.775     1.16     109
94_2,94_3     6.81    11.1 ± 3.418    -52.4 ± 30.4       0.93      85
95_3,95_4    [illegible]  [illegible]  [illegible]       [illegible]  95
95_3,95_5     0.65     1.3 ± 1.59      -1.40 ± 0.62      1.09      95
95_3,95_6     1.29     0.39 ± 0.87     -0.40 ± 0.216     1.08      95
95_4,95_5    -1.03     2.14 ± 2.00     -2.46 ± 1.19      1.48      95
95_4,95_6     2.26     0.39 ± 0.88     -0.40 ± 0.18      1.05      95
95_5,95_6    -0.10     0.232 ± 0.68    -0.24 ± 0.13      1.08      95

TABLE I. The probability enhancement factor is symmetric under the
interchange of the two datasets but $\chi^2_{w12}$ (defined in Eq. 5.9)
is not, so we must specify that the datasets column has the format
dataset 1, dataset 2.



We have also found another breakup of the data to be useful. To
identify localized problems in the data, we have calculated $\beta$
as a function of the amount of data included. For example, in
Fig. 12 we have plotted $\beta(92,95^*)$ vs. $\alpha^*$, where the star in
$95^*$ indicates that only 95 data with RA $\alpha < \alpha^*$ have been
included. One can see here features associated with the
discrepancies seen in the Wiener filter figures. Figures 13 and 14
show the results of similar calculations. For Fig. 14, the order in
which the data is included is reversed (see caption) in order not
to overemphasize the discrepant data at low RA.




[Figure 12: panel titles include 3pt+4pt+(5pt) and 3pt+4pt+5pt+(6pt); horizontal axis is $\alpha^*$ (hours)]

FIG. 12. The probability enhancement factor $\beta(92,95^*)$, where
the * indicates that only data with right ascension, $\alpha$, less
than $\alpha^*$ are included. In the top panel, only the 3 point data
are included for 95. In the panel one lower, in addition to all the
3 point data those 4 point data with $\alpha < \alpha^*$ are included,
etc.
FIG. 13. The probability enhancement factor $\beta(95,92^*)$, where
the * indicates that only data with right ascension, $\alpha$, less
than $\alpha^*$ are included. In the top panel, only the 2-beam data
are included for 92. In the bottom panel, in addition to all the
2-beam data the 3-beam data with $\alpha < \alpha^*$ are included.

FIG. 14. The probability enhancement factor $\beta(92,94^*)$, where
here * indicates that only data with right ascension, $\alpha$,
greater than $\alpha^*$ are included. In the top panel, only the 3-beam
data are being included for 94. In the bottom panel, all the 3-beam
data are included, but only those 2-beam data points with
$\alpha > \alpha^*$. The dashed lines are $\langle \beta(92, 94^*) \rangle_0$ and
one standard deviation above and below.

From the first two entries of the Table and also from Fig. 14 we
see that the 3-beam datasets are more consistent with each other
than the 2-beam datasets, where there is a hint of a problem at low
RA. This hint can also be seen in the Wiener-filtered data shown in
Fig. [?]. Possibly confusing is that in Fig. [?] the discrepancy
looks stronger in the 3-beam Wiener-filtered data. However, this is
because the 2-beam data has significant influence on the best
estimate of the 3-beam signal. Evidence for this relevance of the
2-beam data for the 3-beam data (and vice-versa) comes from the
fact that $\beta(92\_2, 92\_3)$ and $\beta(94\_2, 94\_3)$ are large at 8.3
and 6.8 respectively. A further clue that the problem is with the
2-beam data is in Fig. [?], where there is a suggestion of a problem
at low RA with the 2-beam but not the 3-beam.

The better agreement between the 3-beam datasets than between the
2-beam datasets is possibly due to the greater susceptibility of
the 2-beam data to atmospheric contamination. In particular, the
2-beam data is susceptible to atmospheric gradients while the
3-beam is not. A gradient can arise as the pendulating motion of
the gondola causes the motion of the chopping flat to be slightly
different from constant elevation [?]. Presumably one could test
this hypothesis by searching for signals in the time stream with
the balloon pendulation period.

Both from the Wiener filter figures and the cumulative $\beta$ plot
(Fig. 14) we can see that the MSAM92 and MSAM94 data agree very
well at large RA and therefore what's observed is really on the sky
and not some instrumental artifact. In contrast, the MSAM92/SK
Wiener filter figures and cumulative $\beta$ plots show
discrepancies. These may be due to instrumental problems (in which
case the problem must be with SK95) or foreground contamination
which could affect either instrument. We discuss the possibility of
foreground contamination in section VIII on dust.



VII. FIXING CALIBRATIONS BY COMPARING 
DATASETS 

Every dataset must be calibrated by using the same apparatus to
observe a radiation source of known brightness. This observation
allows for the conversion of the data from some arbitrary units to
temperature or brightness units. Often the brightness of the source
in the passband of interest is only known to 10% or so, in which
case the calibration is a significant source of uncertainty. If
$\Delta'$ is the uncalibrated data then we define the calibration
factor $f$ as $\Delta = f \Delta'$, where $\Delta$ is the calibrated data.
Similarly, the uncalibrated noise covariance matrix gets adjusted
by $C_n = f^2 C_n'$, since the noise is determined from the data
itself.

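One can do likelihood analysis on the uncalibrated data, but with
the appropriate covariance matrix:

$$C' \equiv \langle \Delta' (\Delta')^T \rangle = \langle (s/f)(s/f)^T \rangle + C_n' = \left( \frac{\sigma_8}{f} \right)^2 \tilde C_T + C_n' \qquad (7.1)$$

For a joint dataset, $\Delta_1$ and $\Delta_2$, we have (dropping the
primes):

$$C = \begin{pmatrix} \left( \frac{\sigma_8}{f_1} \right)^2 \tilde C_{T11} + C_{n11} & \left( \frac{\sigma_8}{f_1} \right) \left( \frac{\sigma_8}{f_2} \right) \tilde C_{T12} \\ \left( \frac{\sigma_8}{f_1} \right) \left( \frac{\sigma_8}{f_2} \right) \tilde C_{T21} & \left( \frac{\sigma_8}{f_2} \right)^2 \tilde C_{T22} + C_{n22} \end{pmatrix}$$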

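Note that this covariance matrix, and hence the likelihood, depends
only on $\sigma_8/f_1$ and $\sigma_8/f_2$. The degeneracy among the three
parameters is broken by the calibration measurements of each
experiment, which are usually modeled by a Gaussian:

$$\ln \mathcal{L}_{\rm tot}(\sigma_8, f_1, f_2) = \ln \mathcal{L} \left( \frac{\sigma_8}{f_1}, \frac{\sigma_8}{f_2} \right) - \frac{(f_1 - \bar f_1)^2}{2 \sigma_1^2} - \frac{(f_2 - \bar f_2)^2}{2 \sigma_2^2} \qquad (7.2)$$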

If one is exploring this likelihood space by direct evaluation,
note that one can first evaluate $\ln \mathcal{L}$ on a two dimensional
grid ($\sigma_8/f_1$, $\sigma_8/f_2$) and then evaluate the
three-dimensional $\ln \mathcal{L}_{\rm tot}$ by adding in the (very easy to
calculate) calibration measurement terms. The data as we receive it
has already been calibrated so we usually take $\bar f = 1$.
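The evaluation strategy can be sketched in Python as follows. The data log-likelihood depends only on the two ratios $\sigma_8/f_1$ and $\sigma_8/f_2$, so it is represented here by a callable, `toy_ln_l`, which is a made-up stand-in for a tabulated or interpolated two-dimensional likelihood; the Gaussian calibration terms of Eq. 7.2 are then added on a three-dimensional grid. The calibration means and widths are the MSAM and Netterfield et al. values quoted in the text; the grids, the function names and the toy likelihood are assumptions.

```python
import numpy as np

def ln_l_tot(sigma8, f1, f2, ln_l, f1_bar, s1, f2_bar, s2):
    """Eq. 7.2: data log-likelihood plus Gaussian calibration terms."""
    return (ln_l(sigma8 / f1, sigma8 / f2)
            - (f1 - f1_bar)**2 / (2.0 * s1**2)
            - (f2 - f2_bar)**2 / (2.0 * s2**2))

def toy_ln_l(x1, x2):
    """Hypothetical data log-likelihood in (sigma8/f1, sigma8/f2)."""
    return -((x1 - 1.1)**2 + (x2 - 1.2)**2) / (2.0 * 0.05**2)

# Three-dimensional grid in (sigma8, f1, f2).
sigma8 = np.linspace(0.8, 1.6, 41)
f1 = np.linspace(0.7, 1.3, 31)
f2 = np.linspace(0.7, 1.3, 31)
s, g1, g2 = np.meshgrid(sigma8, f1, f2, indexing="ij")

# Dataset 1: mean 1, width 0.1; dataset 2: mean 1, width 0.14.
total = ln_l_tot(s, g1, g2, toy_ln_l, 1.0, 0.1, 1.0, 0.14)
i, j, k = np.unravel_index(np.argmax(total), total.shape)
print("best grid point:", sigma8[i], f1[j], f2[k])
```

Only the cheap calibration terms are recomputed across the third dimension, which is the saving the two-dimensional tabulation is meant to provide.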

We have evaluated this $\ln \mathcal{L}_{\rm tot}$ for the MSAM92 dataset and
a subset of the SK95 ring dataset, from $13^{\rm h}$ to $22^{\rm h}$,
covering the range of influence of the MSAM92 data. For MSAM92 we
take $\bar f_{\rm MSAM} = 1$ and $\sigma_{\rm MSAM} = 0.1$. For SK we use the
Netterfield et al. calibration $\bar f_{\rm SK} = 1$ and
$\sigma_{\rm SK} = 0.14$. We find in this case that $\ln \mathcal{L}_{\rm tot}$ is
maximized at $f_{\rm MSAM} = 0.99$, $f_{\rm SK} = 0.99$ and $\sigma_8 = 1.13$.
If the Leitch recalibration is used ($\bar f_{\rm SK} = 1.05$,
$\sigma_{\rm SK} = 0.07$) then $\ln \mathcal{L}_{\rm tot}$ is maximized at
$f_{\rm MSAM} = 1.02$, $f_{\rm SK} = 1.02$ and $\sigma_8 = 1.13$.

We can also neglect the calibration measurements entirely and use
the two datasets themselves to find the best relative calibration,
$f_{12} = f_1/f_2$.

$$\mathcal{L}_{12}(f_{12}) \propto \int dx_1\, dx_2\, \mathcal{L}(x_1, x_2)\, \delta(x_1 - f_{12} x_2) \qquad (7.3)$$

where $x_1 = f_1/\sigma_8$ and $x_2 = f_2/\sigma_8$. We find that, once
again restricting the SK95 data to between 13 and 22 hours,
$f_{\rm MSAM,SK} = 1.06^{+[?]}_{-[?]}$. Netterfield et al. [?] find from
their analysis that $f_{\rm MSAM,SK} = 1.22 \pm 0.24$.

Note that there is a possible problem for joint power 
spectrum analysis if relative calibration uncertainty is not 
taken into account. For overlapping experiments neglect 
of this uncertainty could artificially boost high frequency 
power. 

VIII. DUST 

There is a marginally significant discrepancy between the MSAM
observations and those of Saskatoon at large RA. This discrepancy
is possibly due to foreground contamination of either the SK or
MSAM datasets. This hypothesis is supported by the fact that the
discrepancy occurs where the observations are closest to the plane
of the galaxy. Further, from the MSAM interstellar dust data, one
can see that the discrepancy occurs roughly where the dust is
brightest (see Fig. 15).
FIG. 15. Wiener-filtered dust. The points with error bars are the
MSAM92 pixelized dust data. Two-beam in top panel, 3-beam in bottom
panel. The three curves in each panel are the Wiener-filtered data
bounded by ± one standard deviation (assuming Gaussianity!). The
open squares are the results of convolving the IRAS data with the
MSAM beam pattern; the scale is set by a fit to the MSAM data.

Besides different spectral dependence from the CMB, interstellar
dust also has a different spatial frequency dependence than the
CMB. Schlegel, Finkbeiner, and Davis [?] have used the DIRBE and
IRAS maps to infer the power spectrum of interstellar dust. They
find that away from the plane of the galaxy it has the shape
$C_\ell \propto \ell^{-2.5}$. We have therefore used a power spectrum with
this shape to Wiener filter the MSAM dust data (see Fig. 15). The
dust is also known to be highly non-Gaussian. While the mean signal
does not depend on the statistics of the signal, the uncertainties
in the signal do. Therefore one should bear this in mind when
interpreting the graph since the error bars in the figure were
calculated assuming Gaussianity. Along with the MSAM dust data is
the result of convolving the MSAM beam pattern with the IRAS SISSA
240 micron map [?]. The IRAS data have been scaled to fit the MSAM
data. Note that the agreement for the 3-beam data is much better
than for the 2-beam data, once again suggesting that it is a more
reliable dataset.

What we have referred to as the MSAM CMB and dust data are obtained
by fitting each pixel of the four frequency channels (170GHz,
220GHz, 500GHz, 680GHz) of MSAM data to a CMB component and a dust
component. From this fit we get the CMB temperature and dust
optical depth. The dust is assumed to be a "grey" body with
temperature $T = 20$K and emissivity index $\alpha = 1.5$.

Using this model, the dust feature at large RA should not be
showing up in the lowest frequency channel. Therefore the fit
ascribes the structure seen in the 170GHz channel to CMB. However,
the model may be an inadequate description; there may be a
component correlated with the dust with stronger emission at 170GHz
than the thermal dust emission itself. Indeed, the shape of the
dust feature at large RA is somewhat similar to the MSAM CMB
feature at large RA.

The Saskatoon data is single frequency and thus harder to directly
check for contamination. Despite the low frequency, dust (or a
source correlated with dust) contamination is a possibility.
Several datasets point to a correlation between high frequency maps
dominated by thermal emission from dust and lower frequency
measurements [23]. A weak, but significant, correlation has been
seen [24] in a correlation analysis of the entire SK dataset and
dust maps made by [?]; also see [25]. The cause of this correlation
is not yet known although an hypothesis has been advanced by Draine
and Lazarian [26] that it is due to dipole emission from spinning
dust grains.

To investigate this possibility, we have Wiener-filtered the MSAM92
dust measurements onto the SK95 data in the region of overlap with
MSAM. For the shape of the dust power spectrum we used
$C_\ell \propto \ell^{-2.5}$ with amplitude chosen to maximize the
likelihood given the MSAM dust data. Although, as expected, the
dust is brightest in the region of the discrepancy, we have been
unable to identify any more detailed relation between the predicted
dust signal and the discrepancy.
FIG. 16. The MSAM92 dust data (heavy solid line), MSAM92 CMB data
(light solid line) and SK95 data all Wiener-filtered on to the SK95
pixels.

There is another reason for believing the problem may lie with the
SK data. The SK team also did some internal consistency checks on
their data, one of which is the A-B test. Their A and B detectors
measured orthogonal linear polarizations, and thus for an
unpolarized, or weakly polarized, signal, A-B should be consistent
with zero. However, for the region of overlap with MSAM, they find
$\chi^2_{A-B}/\nu = 1.55$ for 80 degrees of freedom. The origin of this
asymmetry is unclear, possibly an instrumental artifact. It is
probably too large to be explained by rotational emission from dust
grains since Draine and Lazarian predict that this component of the
dust emission is only between 0.1% and 10% polarized.



IX. SUMMARY 

We have demonstrated the usefulness of the Wiener filter for making
visual comparisons of datasets. We have emphasized that meaningful
consistency testing requires alternative models with which to
compare. Thus we have explicitly extended our model of the data to
include a possible contaminant and calculated the probability
distribution of the amplitude of this contaminant. For purposes of
extracting just one number from the comparison we advocate
calculating the ratio of the probability of no contamination to the
probability of infinite contamination. Viewed as a statistic, we
have shown this "probability enhancement factor" to be better than
various $\chi^2$ statistics at discriminating between competing
hypotheses.

The utility of our comparison statistics was shown by exercising
them on the MSAM92, MSAM94 and SK95 data. We have found from
comparing MSAM92 and MSAM94 that the most probable level of
contamination is 12%, with zero contamination only 1.05 times less
probable, and total contamination over $2 \times 10^5$ times less
probable. From comparing MSAM92 and SK95 we have found that the
most probable level of contamination is 50%, with zero
contamination only 1.6 times less probable, and total contamination
13 times less probable. Looking at subsets of the data we find a
region at large RA where the SK and MSAM measurements disagree.
From IRAS and from the MSAM dust measurements we know that this
region is also the dustiest region of the overlap between SK and
MSAM. The origin of the discrepancy is unclear and may be due to
instrumental artifacts in SK, or foreground contamination of either
the SK or MSAM measurements.

A revolution is underway in the quality and quantity of CMB data, a
revolution generated by the satellites MAP and Planck as well as by
a number of balloon and ground-based programs. The amount of data
may soon be too large for the type of complete statistical analysis
described here. However, any approximate methods developed for
extracting the power spectrum or parameters will also be applicable
to the statistical procedures introduced here.